Maximizing Learning Progress: An Internal Reward System for Development

Authors

  • Frédéric Kaplan
  • Pierre-Yves Oudeyer
Abstract

This chapter presents a generic internal reward system that drives an agent to increase the complexity of its behavior. This reward system does not reinforce a predefined task. Its purpose is to drive the agent to progress in learning given its embodiment and the environment in which it is placed. The dynamics created by such a system are studied first in a simple environment and then in the context of active vision.
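The abstract does not spell out how the reward is computed. As a minimal sketch, assuming that "progress in learning" is measured as the drop in mean prediction error between two successive windows of experience, such a reward could look like the following. The class name `LearningProgressReward`, the `window` parameter, and the windowed-mean formula are illustrative assumptions, not details taken from the chapter.

```python
from collections import deque


class LearningProgressReward:
    """Hypothetical internal reward: decrease in mean prediction error.

    Assumption (not from the abstract): progress is the mean absolute
    error of an older window minus that of the most recent window.
    """

    def __init__(self, window: int = 5):
        self.window = window
        # Keep only the most recent 2 * window prediction errors.
        self.errors = deque(maxlen=2 * window)

    def update(self, prediction_error: float) -> float:
        """Record one prediction error and return the current reward."""
        self.errors.append(abs(prediction_error))
        if len(self.errors) < 2 * self.window:
            # Not enough history yet to compare two windows.
            return 0.0
        errs = list(self.errors)
        older = sum(errs[:self.window]) / self.window
        recent = sum(errs[self.window:]) / self.window
        # Positive when the predictor is improving, zero when it has
        # stopped improving, negative when it is getting worse.
        return older - recent


def demo():
    """Feed a short, made-up error sequence and collect the rewards."""
    reward = LearningProgressReward(window=2)
    return [reward.update(e) for e in [1.0, 1.0, 0.5, 0.5, 0.5, 0.5]]
```

Under this assumption, no reward is tied to a predefined task: the signal is positive only while the agent's predictions are improving and falls back to zero once a situation is either mastered or unlearnable, which would push the agent toward situations where it can still make progress.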

Similar resources

Implementing Bounded Linear Programming and Analytical Network Process Fuzzy Models to Motivate Employees: a Case Study

In this research, the factors affecting university employees’ motivation and productivity are identified and classified into seven groups; the impact of each motivation factor on productivity is presented by an ANP fuzzy model. Eight universities in Iran were analyzed in this research work. The aim of this study is to explore the productivity of employees. This paper attempts to give new insights ...

O2: Neuroscience and Talent: How Neuroscience Can Enhance Successful Plan of Talent Strategy

Performance and development are based on hard work, experience and learning. Learning how to change different behaviors is crucial to successful talent management plans. Within the brain there are complex connected circuits that can identify threats. The brain reacts to change as a threat. There is also a collection of brain structures tied to a natural reward system that are involved in the re...

When Does Reward Maximization Lead to Matching Law?

What kind of strategies subjects follow in various behavioral circumstances has been a central issue in decision making. In particular, which behavioral strategy, maximizing or matching, is more fundamental to animal's decision behavior has been a matter of debate. Here, we prove that any algorithm to achieve the stationary condition for maximizing the average reward should lead to matching whe...

Learning to represent reward structure: a key to adapting to complex environments.

Predicting outcomes is a critical ability of humans and animals. The dopamine reward prediction error hypothesis, the driving force behind the recent progress in neural "value-based" decision making, states that dopamine activity encodes the signals for learning in order to predict a reward, that is, the difference between the actual and predicted reward, called the reward prediction error. How...

Multiplexing signals in reinforcement learning with internal models and dopamine.

A fundamental challenge for computational and cognitive neuroscience is to understand how reward-based learning and decision-making are made and how accrued knowledge and internal models of the environment are incorporated. Remarkable progress has been made in the field, guided by the midbrain dopamine reward prediction error hypothesis and the underlying reinforcement learning framework, which...

Publication date: 2003